
88 research outputs found

Canonical Representation Genetic Programming

Author: John R. Woodward
Ruibin Bai
Publication date: 01/01/2009

Search spaces sampled by the process of Genetic Programming often consist of programs which can represent a function in many different ways. Thus, when the space is examined, it is highly likely that different programs representing the same function will be tested, which is an undesirable waste of resources. It is argued that, if a search space can be constructed where only unique representations of a function are permitted, then searching it will be more successful than employing multiple representations. When the search space consists of canonical representations it is called a canonical search space, and when Genetic Programming is applied to this search space, it is called Canonical Representation Genetic Programming. The challenge lies in constructing these search spaces. With some function sets this is a trivial task, and with others it is impossible to achieve. With other function sets it is not clear how the goal can be achieved. In this paper, we specifically examine the search space defined by the function set {+, −, ∗, /} and the terminal set {x, 1}. Drawing inspiration from the fundamental theorem of arithmetic, and results regarding the fundamental theorem of algebra, we construct a representation where each function that can be constructed with this primitive set has a unique representation.
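To make the redundancy concrete, the following is a minimal sketch, not the representation constructed in the paper. It assumes that two programs over {+, −, ∗, /} and {x, 1} count as the same function when they reduce to the same rational function, and it uses a reduced fraction of polynomials with a monic denominator as the canonical form. The tuple-based expression encoding and all function names are hypothetical.

```python
from fractions import Fraction

# Polynomials are lists of Fraction coefficients, lowest degree first.

def trim(p):
    """Drop trailing zero coefficients so the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    pad = lambda r: list(r) + [Fraction(0)] * (n - len(r))
    return trim([a + b for a, b in zip(pad(p), pad(q))])

def neg(p):
    return [-a for a in p]

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, q):
    """Polynomial long division; q must be non-zero."""
    p = list(p)
    quot = [Fraction(0)] * max(len(p) - len(q) + 1, 0)
    while len(p) >= len(q):
        shift = len(p) - len(q)
        c = p[-1] / q[-1]
        quot[shift] = c
        for i, b in enumerate(q):
            p[i + shift] -= c * b
        p = trim(p)  # the leading term cancels exactly
    return trim(quot), p

def gcd(p, q):
    """Euclidean algorithm on polynomials."""
    while q:
        p, q = q, divmod_poly(p, q)[1]
    return p

X = ([Fraction(0), Fraction(1)], [Fraction(1)])
ONE = ([Fraction(1)], [Fraction(1)])

def to_fraction(expr):
    """Map an expression tree to an unreduced (numerator, denominator) pair."""
    if expr == "x":
        return X
    if expr == 1:
        return ONE
    op, left, right = expr
    (an, ad), (bn, bd) = to_fraction(left), to_fraction(right)
    if op == "+":
        return add(mul(an, bd), mul(bn, ad)), mul(ad, bd)
    if op == "-":
        return add(mul(an, bd), neg(mul(bn, ad))), mul(ad, bd)
    if op == "*":
        return mul(an, bn), mul(ad, bd)
    if op == "/":
        if not bn:
            raise ZeroDivisionError("division by the zero function")
        return mul(an, bd), mul(ad, bn)
    raise ValueError(f"unknown operator: {op!r}")

def canonical(expr):
    """Reduced fraction with monic denominator, as a pair of tuples."""
    num, den = to_fraction(expr)
    g = gcd(num, den)
    num, den = divmod_poly(num, g)[0], divmod_poly(den, g)[0]
    lead = den[-1]
    return tuple(a / lead for a in num), tuple(a / lead for a in den)
```

Under these assumptions, `("/", ("-", ("*", "x", "x"), 1), ("-", "x", 1))` and `("+", "x", 1)` receive the same canonical form, so a search over canonical forms would evaluate that function once. The sketch treats expressions as formal rational functions and therefore ignores removable singularities such as `x/x` at `x = 0`.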

CiteSeerX

Crossref

An investigation of novel approaches for optimising retail shelf space allocation

Author: Bai Ruibin

This thesis is concerned with real-world shelf space allocation problems that arise due to the conflict of limited shelf space availability and the large number of products that need to be displayed. Several important issues in the shelf space allocation problem are identified and two mathematical models are developed and studied. The first model deals with a general shelf space allocation problem while the second model specifically concerns shelf space allocation for fresh produce. Both models are closely related to the knapsack and bin packing problems. The thesis first studies a recently proposed generic search technique, hyper-heuristics, and introduces a simulated annealing acceptance criterion in order to improve its performance. The proposed algorithm, called simulated annealing hyper-heuristics, is initially tested on the one-dimensional bin packing problem, with very promising and competitive results being produced. The algorithm is then applied to the general shelf space allocation problem. The computational results show that the proposed algorithm is superior to a general simulated annealing algorithm and other types of hyper-heuristics. For the test data sets used in the thesis, the new approach solves every instance to over 98% of the upper bound which was obtained via a two-stage relaxation method. The thesis also studies and formulates a deterministic shelf space allocation and inventory model specifically for fresh produce. The model, for the first time, considers the freshness condition as an important factor in influencing a product's demand. Further analysis of the model shows that the search space of the problem can be reduced by decomposing the problem into a nonlinear knapsack problem and a single-item inventory problem that can be solved optimally by a binary search.
Several heuristic and meta-heuristic approaches are utilised to optimise the model, including four efficient gradient based constructive heuristics, a multi-start generalised reduced gradient (GRG) algorithm, simulated annealing, a greedy randomised adaptive search procedure (GRASP) and three different types of hyper-heuristics. Experimental results show that the gradient based constructive heuristics are very efficient and all meta-heuristics can only marginally improve on them. Among these meta-heuristics, two simulated annealing based hyper-heuristics perform slightly better than the other meta-heuristic methods. Across all test instances of the three problems, it is shown that the introduction of simulated annealing in the current hyper-heuristics can indeed improve the performance of the algorithms. However, the simulated annealing hyper-heuristic with random heuristic selection generally performs best among all the meta-heuristics implemented in this thesis. This research is funded by the Engineering and Physical Sciences Research Council (EPSRC) grant reference GR/R60577. Our industrial collaborators include Tesco Retail Vision and SpaceIT Solutions Ltd.
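The idea of a hyper-heuristic with random heuristic selection and a simulated annealing acceptance criterion can be sketched on one-dimensional bin packing as follows. This is an illustrative toy, not the thesis's algorithm: the two low-level heuristics, the squared-fill cost function, the geometric cooling schedule and all names and parameter values are assumptions.

```python
import math
import random

def cost(bins, capacity):
    """Lower is better: rewards fuller bins (assumed objective)."""
    return -sum((sum(b) / capacity) ** 2 for b in bins)

def move_item(bins, capacity, rng):
    """Low-level heuristic: move one random item into another bin with room."""
    src = rng.randrange(len(bins))
    item = rng.choice(bins[src])
    targets = [i for i, b in enumerate(bins)
               if i != src and sum(b) + item <= capacity]
    if not targets:
        return bins  # no feasible move; leave the solution unchanged
    dst = rng.choice(targets)
    new = [list(b) for b in bins]
    new[src].remove(item)
    new[dst].append(item)
    return [b for b in new if b]  # drop a bin that became empty

def swap_items(bins, capacity, rng):
    """Low-level heuristic: swap two random items between two bins."""
    if len(bins) < 2:
        return bins
    i, j = rng.sample(range(len(bins)), 2)
    a, b = rng.choice(bins[i]), rng.choice(bins[j])
    if sum(bins[i]) - a + b > capacity or sum(bins[j]) - b + a > capacity:
        return bins  # swap would overflow a bin
    new = [list(x) for x in bins]
    new[i].remove(a)
    new[i].append(b)
    new[j].remove(b)
    new[j].append(a)
    return new

def sa_hyper_heuristic(items, capacity, iterations=5000,
                       t_start=1.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    heuristics = [move_item, swap_items]
    current = [[w] for w in items]  # start with one item per bin
    best = current
    alpha = (t_end / t_start) ** (1.0 / iterations)  # geometric cooling
    t = t_start
    for _ in range(iterations):
        heuristic = rng.choice(heuristics)  # random heuristic selection
        candidate = heuristic(current, capacity, rng)
        delta = cost(candidate, capacity) - cost(current, capacity)
        # Simulated annealing acceptance: always take improvements,
        # take worsening moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = candidate
            if cost(current, capacity) < cost(best, capacity):
                best = current
        t *= alpha
    return best
```

A call such as `sa_hyper_heuristic([4, 4, 4, 4, 2, 2, 2, 2], capacity=10)` returns a list of bins, each a list of item sizes. The high-level loop never inspects problem details beyond the cost value, which is the separation that lets the same acceptance rule be reused across problem domains.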

Nottingham ePrints

A variable neighborhood search algorithm with reinforcement learning for a real-life periodic vehicle routing problem with time windows and open routes

Author: Bai Ruibin
Chen Binhui
Laesanklang Wasakorn
Qu Rong
Publication venue: 'EDP Sciences'
Publication date: 23/07/2020

Based on a real-life container transport problem, a model of the Open Periodic Vehicle Routing Problem with Time Windows (OPVRPTW) is proposed in this paper. In a wide planning horizon, which is divided into a number of shifts, a fixed number of trucks are scheduled to complete container transportation tasks between terminals subject to time constraints. In this problem, the routes traveled by trucks are open, as returning to the starting depot is not required in every single shift but every two shifts. Our study shows that it is unrealistic to address this large scale and nonlinearly constrained problem with exact search methods. A Reinforcement Learning Based Variable Neighbourhood Search algorithm (VNS-RLS) is developed for OPVRPTW. The initial solution is constructed with an urgency level-based insertion heuristic, while different insertion selection strategies are compared. In the local search phase of VNS-RLS, reinforcement learning is used to guide the search, adaptively adjusting the probabilities of operators being invoked according to changes in the feasibility and quality of the generated solutions. In addition, the impact of sampling neighbourhood space in single solution-based algorithms is also investigated. Three indicators are designed in the proposed Sampling module to set the starting configuration of local search. Experimental results on different sizes of real and artificial benchmark instances show that the proposed Sampling scheme and feasibility indicator decrease the infeasible rate during the search. However, Sampling's contribution to solution quality improvement is not significant in this single solution-based algorithm. Compared to the exact search and two state-of-the-art algorithms, VNS-RLS produces promising results.
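The adaptive operator selection described above can be illustrated with a small sketch. It is not the VNS-RLS update rule, which the abstract does not specify: the exponential-smoothing weight update, the minimum weight, the reward values and the operator names are all assumptions made for illustration.

```python
import random

class OperatorSelector:
    """Roulette-wheel operator choice with reward-driven weights (assumed scheme)."""

    def __init__(self, names, learning_rate=0.1, min_weight=0.05, seed=0):
        self.weights = {name: 1.0 for name in names}
        self.learning_rate = learning_rate
        self.min_weight = min_weight  # keeps every operator selectable
        self.rng = random.Random(seed)

    def probabilities(self):
        total = sum(self.weights.values())
        return {name: w / total for name, w in self.weights.items()}

    def select(self):
        names = list(self.weights)
        return self.rng.choices(names, weights=[self.weights[n] for n in names])[0]

    def update(self, name, reward):
        """Blend the old weight with the observed reward.

        The reward is assumed to encode the change in feasibility and
        quality of the solution produced by the operator.
        """
        lr = self.learning_rate
        blended = (1 - lr) * self.weights[name] + lr * reward
        self.weights[name] = max(self.min_weight, blended)
```

In a local search loop, `select()` would pick the next neighbourhood operator and `update()` would be called with a reward after the move is evaluated, so operators that keep producing feasible, better solutions are invoked more often.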

Repository@Nottingham

EDP Sciences OAI-PMH repository (1.2.0)

A set-covering model for a bidirectional multi-shift full truckload vehicle routing problem

Author: Bai Ruibin
Chen Jianjun
Roberts Gethin Wyn
Xue Ning
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

This paper introduces a bidirectional multi-shift full truckload transportation problem with operation dependent service times. The problem is different from previous container transport problems, and the existing approaches for container transport problems and vehicle routing pickup and delivery are either not suitable or inefficient. In this paper, a set covering model is developed for the problem based on a novel route representation and a container-flow mapping. It is demonstrated that the model can be applied to solve real-life, medium sized instances of the container transport problem at a large international port. A lower bound of the problem is also obtained by relaxing the time window constraints to the nearest shifts and transforming the problem into a service network design problem. Implications and managerial insights from the lower bound results are also provided.

Nottingham ePrints

CiteSeerX

Nottingham eTheses

Repository@Nottingham

Fuzzy C-means-based scenario bundling for stochastic service network design

Author: Aickelin Uwe
Bai Ruibin
Jiang Xiaoping
Landa-Silva Dario
Publication date: 01/01/2017

Stochastic service network designs with uncertain demand represented by a set of scenarios can be modelled as a large-scale two-stage stochastic mixed-integer program (SMIP). The progressive hedging algorithm (PHA) is a decomposition method for solving the resulting SMIP. The computational performance of the PHA can be greatly enhanced by decomposing according to scenario bundles instead of individual scenarios. At the heart of bundle-based decomposition is the method for grouping the scenarios into bundles. In this paper, we present a fuzzy c-means-based scenario bundling method to address this problem. Rather than full membership of a bundle, which is typically the case in existing scenario bundling strategies such as k-means, a scenario has partial membership in each of the bundles and can be assigned to more than one bundle in our method. Since the multiple bundle membership of a scenario induces overlap between the bundles, we empirically investigate whether and how the amount of overlap controlled by a fuzzy exponent would affect the performance of the PHA. Experimental results for a less-than-truckload transportation network optimization problem show that the number of iterations required by the PHA to achieve convergence reduces dramatically with large fuzzy exponents, whereas the computation time increases significantly. Experimental studies were conducted to find a fuzzy exponent that strikes a good trade-off between solution quality and computational time.
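How the fuzzy exponent controls overlap can be seen from the standard fuzzy c-means membership formula, u_k = 1 / Σ_j (d_k / d_j)^(2/(m−1)), where d_k is the distance from a scenario to bundle centre k and m > 1 is the fuzzy exponent. The sketch below evaluates that formula for fixed centres. The threshold rule for assigning a scenario to several bundles is a hypothetical choice for illustration; the abstract does not state the paper's assignment rule.

```python
import math

def fuzzy_memberships(point, centres, m=2.0):
    """Standard fuzzy c-means memberships of one point for fixed centres."""
    if m <= 1.0:
        raise ValueError("the fuzzy exponent m must be greater than 1")
    dists = [math.dist(point, c) for c in centres]
    if any(d == 0.0 for d in dists):
        # The point coincides with a centre: give it full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]

def bundles_for(point, centres, m, threshold=0.3):
    """Hypothetical rule: join every bundle whose membership reaches the threshold."""
    memberships = fuzzy_memberships(point, centres, m)
    return [k for k, u in enumerate(memberships) if u >= threshold]
```

For a scenario at `(1.0,)` with centres `(0.0,)` and `(4.0,)`, the memberships are 0.9 and 0.1 at `m = 2`, so the scenario joins only the first bundle under the 0.3 threshold. At `m = 4` the memberships flatten to roughly 0.68 and 0.32 and the scenario joins both bundles, which is the overlap effect the abstract investigates.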

Nottingham ePrints

Nottingham eTheses

Crossref

University of Melbourne Institutional Repository

Forecasting stock market return with nonlinearity: a genetic programming approach

Author: Bai Ruibin
Cui Tianxiang
Ding Shusheng
Xiong Xihan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/02/2020

The issue of whether returns in the stock market are predictable remains ambiguous. This paper attempts to establish new return forecasting models in order to contribute to addressing this issue. In contrast to the existing literature, we first reveal that forecasting accuracy can be improved through better model specification without adding any new variables. Instead of having a unified return forecasting model, we argue that stock markets in different countries should have different forecasting models. Furthermore, we adopt an evolutionary procedure called genetic programming (GP) to develop our new models with nonlinearity. Our newly developed forecasting models are shown to be more accurate than traditional AR-family models. More importantly, the trading strategy we propose based on our forecasting models has been verified to be highly profitable in different types of stock markets in terms of stock index futures trading.

Nottingham ePrints

Nottingham eTheses

Boosting the Discriminant Power of Naive Bayes

Author: Bai Ruibin
Jiang Xudong
Lian Xiaoyu
Ren Jianfeng
Wang Shihe
Publication date: 20/09/2022

Naive Bayes has been widely used in many applications because of its simplicity and its ability to handle both numerical and categorical data. However, its lack of modeling of correlations between features limits its performance. In addition, noise and outliers in real-world datasets also greatly degrade the classification performance. In this paper, we propose a feature augmentation method employing a stack auto-encoder to reduce the noise in the data and boost the discriminant power of naive Bayes. The proposed stack auto-encoder consists of two auto-encoders for different purposes. The first encoder shrinks the initial features to derive a compact feature representation in order to remove the noise and redundant information. The second encoder boosts the discriminant power of the features by expanding them into a higher-dimensional space so that different classes of samples can be better separated. By integrating the proposed feature augmentation method with the regularized naive Bayes, the discrimination power of the model is greatly enhanced. The proposed method is evaluated on a set of machine-learning benchmark datasets. The experimental results show that the proposed method significantly and consistently outperforms the state-of-the-art naive Bayes classifiers.

Comment: Accepted by 2022 International Conference on Pattern Recognition

arXiv.org e-Print Archive

A dynamic truck dispatching problem in marine container terminal

Author: Bai Ruibin
Chen Jianjun
Dong Haibo
Kendall Graham
Qu Rong
Publication date: 09/12/2016

In this paper, a dynamic truck dispatching problem of a marine container terminal is described and discussed. In this problem, a few containers, encoded as work instructions, need to be transferred between yard blocks and vessels by a fleet of trucks. Both the yard blocks and the quay are equipped with cranes to support loading/unloading operations. In order to service more vessels, any unnecessary idle time between quay crane (QC) operations needs to be minimised to speed up the container transfer process. Due to the unpredictable port situations that can affect routing plans and the short calculation time allowed to generate one, static solution methods are not suitable for this problem. In this paper, we introduce a new mathematical model that minimises both the QC makespan and the truck travelling time. Three dynamic heuristics are proposed and a genetic algorithm hyperheuristic (GAHH) under development is also described. Experimental results show the promising capabilities that the GAHH may offer.

Nottingham ePrints

Nottingham eTheses

Crossref

A Max-relevance-min-divergence Criterion for Data Discretization with Applications on Naive Bayes

Author: Bai Ruibin
Jiang Xudong
Ren Jianfeng
Wang Shihe
Yao Yuan
Publication date: 04/04/2023

In many classification models, data is discretized to better estimate its distribution. Existing discretization methods often aim to maximize the discriminant power of discretized data, while overlooking the fact that the primary target of data discretization in classification is to improve the generalization performance. As a result, the data tend to be over-split into many small bins since the data without discretization retain the maximal discriminant information. Thus, we propose a Max-Dependency-Min-Divergence (MDmD) criterion that maximizes both the discriminant information and generalization ability of the discretized data. More specifically, the Max-Dependency criterion maximizes the statistical dependency between the discretized data and the classification variable while the Min-Divergence criterion explicitly minimizes the JS-divergence between the training data and the validation data for a given discretization scheme. The proposed MDmD criterion is technically appealing, but it is difficult to reliably estimate the high-order joint distributions of attributes and the classification variable. We hence further propose a more practical solution, the Max-Relevance-Min-Divergence (MRmD) discretization scheme, where each attribute is discretized separately, by simultaneously maximizing the discriminant information and the generalization ability of the discretized data. The proposed MRmD is compared with the state-of-the-art discretization algorithms under the naive Bayes classification framework on 45 machine-learning benchmark datasets. It significantly outperforms all the compared methods on most of the datasets.

Comment: Under major revision of Pattern Recognition
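The Min-Divergence term can be illustrated with a short sketch that computes the Jensen-Shannon divergence between the binned training and validation distributions of one attribute for a given set of cut points. Only the JS-divergence itself is standard here. The helper names and the way the histograms are built are assumptions, and the paper's full MRmD criterion (including the relevance term) is not reproduced.

```python
import bisect
import math

def histogram(values, cut_points):
    """Relative frequency of values in the bins defined by sorted cut points."""
    counts = [0] * (len(cut_points) + 1)
    for v in values:
        counts[bisect.bisect_right(cut_points, v)] += 1
    return [c / len(values) for c in counts]

def kl_divergence(p, q):
    """Kullback-Leibler divergence in nats; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln 2."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)

def train_validation_divergence(train, validation, cut_points):
    """Divergence between training and validation data under one discretization."""
    return js_divergence(histogram(train, cut_points),
                         histogram(validation, cut_points))
```

A discretization with many narrow bins tends to give noisier histograms and hence a larger divergence between training and validation data, which is the over-splitting that the Min-Divergence criterion is meant to penalize.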

arXiv.org e-Print Archive